Language Models Represent Space and Time

arXiv.org Artificial Intelligence

The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a coherent model of the data generation process, that is, a world model. We find preliminary evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g., cities and landmarks). In addition, we identify individual "space neurons" and "time neurons" that reliably encode spatial and temporal coordinates. While further investigation is needed, our results suggest modern LLMs learn rich spatiotemporal representations of the real world and possess basic ingredients of a world model.
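
To illustrate what a "linear representation" means in practice, the sketch below fits a linear probe that reads a coordinate out of activation vectors. Everything here is an assumption made for illustration: the activations are synthetic (a single coordinate written along one random direction plus Gaussian noise), the probe is plain least squares fit by gradient descent, and the function names are hypothetical. It is not the paper's code and may differ from the paper's probing procedure.

```python
import random


def make_synthetic_activations(n, dim, noise, rng):
    """Hypothetical stand-in for model activations.

    Each sample encodes one coordinate y (think: a normalized latitude)
    along a fixed unit direction, plus independent Gaussian noise.
    """
    direction = [rng.gauss(0, 1) for _ in range(dim)]
    norm = sum(v * v for v in direction) ** 0.5
    direction = [v / norm for v in direction]
    xs, ys = [], []
    for _ in range(n):
        y = rng.uniform(-1, 1)
        xs.append([y * d + rng.gauss(0, noise) for d in direction])
        ys.append(y)
    return xs, ys


def predict(x, w, b):
    """Linear probe output: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_linear_probe(xs, ys, steps=200, lr=0.5):
    """Fit w, b by full-batch gradient descent on mean squared error."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = predict(x, w, b) - y
            grad_b += 2 * err / n
            for j in range(dim):
                grad_w[j] += 2 * err * x[j] / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def r_squared(xs, ys, w, b):
    """Coefficient of determination of the probe on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


rng = random.Random(0)
all_x, all_y = make_synthetic_activations(400, 8, 0.1, rng)
train_x, train_y = all_x[:300], all_y[:300]   # fit the probe here
test_x, test_y = all_x[300:], all_y[300:]     # score it on held-out samples

w, b = fit_linear_probe(train_x, train_y)
test_r2 = r_squared(test_x, test_y, w, b)
print(f"held-out R^2: {test_r2:.3f}")
```

A high held-out R^2 in this toy setting only shows that the coordinate is linearly decodable from the synthetic vectors. With real model activations, the same check would be run per layer and against controls before drawing any conclusion.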


Amazon Web Services AI exec: How cloud computing is driving artificial intelligence breakthroughs

#artificialintelligence

Artificial intelligence research is still in its infancy, at least as compared to computer science in general, but the concept of unlimited computing resources is accelerating the field. As someone with nearly unlimited computing resources at his disposal, this is something Swami Sivasubramanian, vice president of AI at Amazon Web Services, is watching play out. Last week Sivasubramanian walked GeekWire Cloud Tech Summit attendees through the array of artificial intelligence and machine-learning services that his team has developed for AWS customers and Amazon's own internal services as well. If you've been through a few tech cycles, you've already heard a lot about artificial intelligence. Much has been promised from this research field over several decades, but the enormous amount of data now moving into cloud computing services like AWS and others allows researchers like Sivasubramanian to make real breakthroughs that weren't possible when data sets were scattered and siloed.